Multi Feature Space Combination for Authorship Clustering
Authors
Abstract
The Author Identification task for PAN 2016 consisted of three different sub-tasks: authorship clustering, authorship links and author diarization. We developed machine learning approaches for two of these three tasks. For the two authorship-related tasks we created various sets of feature spaces. The challenge was to combine these feature spaces so that the machine learning algorithms could distinguish different authors across multiple feature spaces. For authorship clustering, we combine these feature spaces and use a two-step approach for clustering. We then use the results of the clustering, together with a new feature space, to determine links between documents in the given problems.
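The abstract does not spell out the feature spaces or the two clustering steps, so the following is only a minimal sketch of the general idea of combining several feature spaces before clustering. Everything in it is an assumption made for illustration: the two feature spaces (character trigrams and word counts), the averaging of per-space cosine distances, the single-link grouping rule, and the names `char_ngrams`, `word_counts`, `combined_distances` and `cluster`. It is not the authors' method.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Hypothetical feature space 1: character n-gram counts."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_counts(text):
    """Hypothetical feature space 2: whitespace-separated word counts."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Cosine distance between two sparse count vectors (1.0 if either is empty)."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0


def combined_distances(docs, extractors):
    """Average the per-feature-space cosine distances for every document pair."""
    n = len(docs)
    dist = [[0.0] * n for _ in range(n)]
    for extract in extractors:
        vecs = [extract(d) for d in docs]
        for i in range(n):
            for j in range(i + 1, n):
                d = cosine_distance(vecs[i], vecs[j]) / len(extractors)
                dist[i][j] += d
                dist[j][i] += d
    return dist


def cluster(dist, threshold):
    """Single-link grouping: join documents whose combined distance is below threshold."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Calling `cluster(combined_distances(docs, [char_ngrams, word_counts]), threshold)` returns lists of document indices, one list per presumed author. The threshold is a free parameter of this sketch and would need tuning on real data.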
Similar resources
Optimum Ensemble Classification for Fully Polarimetric SAR Data Using Global-Local Classification Approach
In this paper, an ensemble classification method for fully polarimetric synthetic aperture radar (PolSAR) data using a global-local classification approach is presented. In the first step, to perform the global classification, the training feature space is divided into a specified number of clusters. In the next step, to carry out the local classification over each of these clusters, which cont...
MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...
A New Image Segmentation Method Based on Fuzzy Clustering Using Multi-objective Differential Evolution
Image segmentation is one of the most important and difficult steps in machine vision problems and achieving the desired results often requires satisfaction of different objectives. One approach to face this situation uses multi-objective fuzzy clustering of pixels in the feature space. This paper proposes a new strategy for search within the family of multi-objective differential evolution alg...
Vote/Veto Classification, Ensemble Clustering and Sequence Classification for Author Identification
The Author Identification task for PAN 2012 consisted of three different sub-tasks: traditional authorship attribution, authorship clustering and sexual predator identification. We developed three machine learning approaches for these tasks. For the two authorship related tasks we created various sets of feature spaces, where individual differences in writing styles are assumed to surface in ju...
A Divisive Hierarchical Clustering-Based Method for Indexing Image Information
It is conventional to use multi-dimensional indexing structures to accelerate search operations in content-based image retrieval systems. Many efforts have been made so far to develop multi-dimensional indexing structures. In most practical applications of image retrieval, high-dimensional feature vectors are required, but current multi-dimensional indexing structures lose their effici...